A Machine That Learns 

Concerning Machina docilis, descendant of Machina
speculatrix, the small imitation of life that was
described in the May, 1950, issue of this magazine


THIS ARTICLE is a sequel to one published here last year which
described experiments with a simple little machine designed to mimic
certain elementary features of animal behavior (“An Imitation of
Life,” Scientific American, May, 1950). Consisting only of two vacuum
tubes, two motors, a photoelectric cell and a touch contact, all
enclosed in a tortoise-shaped shell, the model was a species of
artificial creature which could explore its surroundings and seek out
favorable conditions. It was named Machina speculatrix. Although it
possessed just three simple characteristics (the properties of being
attracted to moderate light and repelled by bright light or by
material obstacles), M. speculatrix displayed complex and
unpredictable habits of behavior, resembling in some ways the random
variability, or “free will,” of an animal’s responses to stimuli. But
its responses were in no way modified by experience; in other words,
it lacked the power to learn.

We have gone on from that early 
model to the design of a more advanced 
mechanical creature which does possess 
the ability to learn. The present report 
will describe this new creature, named 
M. docilis from the Latin word meaning 
teachable. 

The mechanism of learning is of course one of the most enthralling and
baffling mysteries in the field of biology. In its simplest
experimental form modification of behavior by experience is often
called “conditioning,” a term suggested by the Russian physiologist
I. P. Pavlov, whose original experiments on “conditioned reflexes”
brought the study of higher nervous function into the realm of brain
physiology. The basic event in this form of learning is that an
unrelated stimulus, when repeatedly coupled with one that evokes a
certain response, comes to acquire the meaning of the original
stimulus. In the classical experiments on animals the activity used as
the basis for conditioning was a simple reflex: the flow of saliva
when food enters the mouth, or the withdrawal of a leg when a painful
stimulus is given to the foot. The food or the pain is called the
unconditioned, or specific, stimulus. The conditioned, or neutral,
stimulus to which the animal is trained to respond with the same
behavior can be any event to which the animal is sensitive: a light, a
sound, a touch, anything at all. If, for example, a bell is rung on 10
or 20 occasions just as food is offered, the flow of saliva, which
originally occurred only at the sight of the food, eventually is
conditioned to begin as soon as the bell is rung. After about 20 more
repetitions the bell alone, without the presence of food, evokes
almost as copious a flow of saliva as does the food itself. One may
say that the bell comes to “mean” food.

SUCH learning is of course perfectly familiar in ourselves and is the
basis of all animal training. Indeed it has been argued that all
learning is based on conditioning, for any bodily function can be made
the basis of a conditioned reflex, and one conditioned reflex can be
built on another. Even quite unconscious changes, such as quickening
of the pulse, dilation of the pupils, a rise in blood sugar or a fall
in temperature, can be “conditioned” to some previously neutral
stimulus by mere repetition. In this way it is possible to obtain
control over functions originally quite involuntary. A man can “learn”
to slow his pulse, flush, go pale, secrete sugar in his urine and so
forth by a process of simple conditioning. This process may be
conscious and deliberate, and such training accounts for the feats of
Yogi fakirs. It may also be unconscious and even undesired by the
subject, sometimes producing “psychosomatic” disorders, in which
symptoms of bodily disease are attributable to nervous strain or
conflict.

In spite of the vast mass of empirical information collected by Pavlov
and his pupils, we still do not understand the process whereby the
neutral stimulus acquires the meaning of the original one. But it is
clear that one of the principal requirements for this associative
learning is a complex mechanism of memory, capable not only of storing
the traces of the two series of events but also of providing the
information that the coincidence between the two is greater than would
be expected by chance. The creation of such a memory mechanism was the
problem to which we addressed ourselves in designing M. docilis.

Our earlier model, M. speculatrix, had a very elementary form of
memory. In order to get around an obstacle it encountered, the model
had to remember it long enough to get well away from the hindrance
before resuming its journey to the attracting light. Even among living
creatures such a memory is not universal; the absence or brevity of
this memory accounts for the tireless and ineffective buzzing of a fly
on a windowpane. M. speculatrix’s elementary memory works as follows:
When the model touches an obstacle, the contact closes a circuit which
converts its two-stage amplifier into an oscillator of the type known
as a “multivibrator.” The oscillations thus generated make the model
stop, turn, withdraw, and go forward, and these maneuvers are repeated
until the contact is opened by clearance of the obstacle. It is a
characteristic of this simple circuit that while it is oscillating it
cannot amplify, so the model is blind to the attracting light while
circumventing a material difficulty. Furthermore, even after the touch
contact is opened, one more oscillatory discharge takes place, and
this ensures that the model moves well away from the obstacle before
regaining its vision. The after-discharge in the oscillatory circuit
is an example of the most elementary form of memory trace, in which
the internal effect of a stimulus outlasts its external duration. Such
an after-discharge is common in the reflex activity of the spinal cord
of animals, and the more complex the reflex, the longer the
after-discharge is likely to last. When you step on a tack, your leg
is withdrawn by reflex action, but the withdrawal continues after your
foot has left the tack, so that when you straighten your leg again it
does not come down on the same place.
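
The after-discharge just described can be put in a few lines of
Python, offered only as a minimal sketch. It assumes that time is
counted in whole oscillation cycles and that exactly one extra cycle
follows the opening of the contact; the function name and the two mode
labels are illustrative and are not part of the original design.

```python
def obstacle_response(contact):
    """Map touch-contact readings (one per oscillation cycle) to modes.

    While the contact is closed the circuit oscillates ("avoid") and
    cannot amplify, so the light is ignored. One further "avoid" cycle
    follows after the contact opens: the after-discharge.
    """
    modes = []
    after_discharge = 0  # extra cycles still owed after contact opens
    for touching in contact:
        if touching:
            modes.append("avoid")
            after_discharge = 1   # assumption: one extra discharge
        elif after_discharge > 0:
            modes.append("avoid")
            after_discharge -= 1
        else:
            modes.append("seek light")
    return modes


print(obstacle_response([False, True, True, False, False]))
# prints: ['seek light', 'avoid', 'avoid', 'avoid', 'seek light']
```

The third "avoid" occurs after the contact has already opened, which
is the sense in which the internal effect outlasts the stimulus.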

ON first analysis the problem of transforming M. speculatrix into an
educable species seemed quite simple. Its essentials are illustrated
by the upper of the two diagrams at the lower left-hand side of the
next page. In M. speculatrix we had a reflex mechanism with three
elements: a specific stimulus Ss (a light or touch), which produced a
specific effect Es (the operation of the motor relays) by way of a
transmission system T1 (the two-stage amplifier). To introduce the
factor of conditioning, this mechanism must be linked with a second,
activated by a neutral stimulus which does not initially produce the
effect Es. The second arrangement would consist of the neutral
stimulus Sn and a transmission system T2. (It might produce a specific
effect of its own, Es2, but with this we are not at the moment
concerned.) T1 must be linked with T2 in such a way that the former
comes to respond to the neutral stimulus with its normal effect Es, as
if Sn were in fact Ss. This means that there must be a “learning box”
of some kind between T1 and T2. The question is: What are we to put
into the learning box (L)?

Obviously it must contain an apparatus which will receive signals from
both T1 and T2 and combine them in such a manner that after Ss and Sn
have occurred together more often than they would by chance, Sn can
find its way through the learning box and have the effect Es. We
experimented with some simple electronic circuits suggested by these
requirements, but the first trials were disappointing. We soon
realized that a more detailed analysis of the learning process would
be necessary. It was clear that the statistical relation between Ss
and Sn would have to be assessed before we could determine how to
establish an association between them. That is, circuits must be
provided to deal with any particular Ss and Sn in such a way that only
a significant degree of coincidence between them would be registered.
For example, an animal being trained to expect food when a bell is
rung must first decide whether the ringing of the bell is really worth
noticing. If bells are rung and food is offered entirely at random,
there is no basis for supposing the two to be in any way related.

It took some time to appreciate the number and complexity of the
operations involved in establishing a connection between different
stimuli to achieve a conditioned response. Eventually it was found
that no fewer than seven distinct operations must be performed. They
are:

1. The beginning of the specific stimulus must be sharply
differentiated from the absence of the stimulus. That is, it is the
change that is important, e.g., the transition from no food to food in
the case of an animal, rather than the duration of the stimulus.

2. On the other hand, the impact of the neutral stimulus must be
extended in time. This is because it may occur some while before the
specific stimulus and must therefore be “remembered” long enough for
its significance to be noticed.

3. The series of clipped Ss and 
stretched Sn must be mixed in such a 
way that their areas of coincidence are 
appreciated. 

4. The coincident areas must all be summated, or integrated, to form a
consolidated stimulus.

5. When the sum of all the areas of coincidence reaches a value
greater than would ever be obtained by chance, the memory process is
activated. This activation is in the nature of a trigger process: a
single event, analogous to a flash of insight into a contingency
previously ignored.

6. Once the existence of a significant degree of coincidence between
Ss and Sn has been registered, it is preserved in the memory for some
time and fades away gradually. In the M. docilis model the memory
takes the form of a damped oscillation, but it could well be any
mechanical, chemical or electrical process in which stored energy is
slowly released, as in the escapement of a watch. It is essential only
that the energy should be in such a form that it can be readily
available for the final operation.

7. This final phase is the combination of the preserved trace with a
fresh Sn to give Es as the new conditioned response. The operation is
analogous to the testing by experiment of a hypothesis, the hypothesis
here being the likelihood of a correlation between Ss and Sn.

MACHINA SPECULATRIX, photographed by time exposure, is attracted by
light in hutch at right. It begins at left, encounters obstacle, backs
away, encounters obstacle again, backs away again and enters the hutch.

In terms of conditioned reflexes, the acquired response must be
reinforced, otherwise it will vanish without trace. Consequently when
the fresh Sn is presented in the seventh operation, it must be
followed by the confirming Ss. Eventually, after a number of such
events, the new response Sn → Es is permanently established and
requires no further corroboration.
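
The seven operations listed above, and the gate they feed, can be
imitated numerically. The sketch below is a rough software analogue
under stated assumptions, not the electronic circuit of M. docilis:
time runs in discrete steps, each stimulus is 0 or 1, the damped
oscillation of the memory is reduced to its exponentially decaying
envelope, the integrator is made slightly leaky so that rare chance
coincidences fade, and every name and constant (run_learning_box,
paired_trials, threshold and the rest) is hypothetical.

```python
def run_learning_box(ss, sn, stretch=0.8, leak=0.99, threshold=3.0,
                     memory_decay=0.98, memory_floor=0.2):
    """Return, step by step, whether Sn evokes Es (1) or not (0)."""
    prev_ss = 0
    stretched = 0.0  # operation 2: lingering trace of Sn
    total = 0.0      # operation 4: running sum of coincidences
    memory = 0.0     # operation 6: envelope of the stored trace
    responses = []
    for s, n in zip(ss, sn):
        # 1. Clip Ss to its onset (the change, not the duration).
        onset = 1.0 if (s and not prev_ss) else 0.0
        prev_ss = s
        # 2. Stretch Sn so it is still "remembered" when Ss arrives.
        stretched = 1.0 if n else stretched * stretch
        # 3. Mix clipped Ss with stretched Sn to find their overlap.
        coincidence = onset * stretched
        # 4. Sum the overlaps (assumed leaky, see lead-in).
        total = total * leak + coincidence
        # 5. Trigger the memory once the sum passes the threshold.
        if total >= threshold:
            memory = 1.0
            total = 0.0
        # 7. Gate: a fresh Sn gives Es only while the memory is alive.
        responses.append(1 if (n and memory > memory_floor) else 0)
        # 6. The stored trace fades gradually.
        memory *= memory_decay
    return responses


def paired_trials(count, gap=5):
    """Build trains in which Sn (whistle) comes one step before Ss (light)."""
    ss, sn = [], []
    for _ in range(count):
        sn += [1, 0] + [0] * gap
        ss += [0, 1] + [0] * gap
    return ss, sn


ss, sn = paired_trials(10)
trained = run_learning_box(ss + [0], sn + [1])[-1]
untrained = run_learning_box([0], [1])[-1]
forgotten = run_learning_box(ss + [0] * 201, sn + [0] * 200 + [1])[-1]
print(trained, untrained, forgotten)  # prints: 1 0 0
```

In this toy version a lone whistle evokes the response shortly after
ten paired trials, evokes nothing in an untrained model, and evokes
nothing again once the trace has been left to decay for 200 steps
without reinforcement.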

ALL THIS can be represented in a diagram of a simple nervous system
(see the lower of the two diagrams at the lower left-hand side of this
page). In this drawing there are two series of nerve cells (two reflex
arcs) which correspond to the transmission systems T1 and T2. Between
the two is a network of nerve cells which serve to perform the seven
operations detailed above. Branching off from the first reflex arc is
a synapse (1) with the property of discharging only at the beginning
of the stimulus; this corresponds to the perception of food. In the
second reflex arc is a synapse (2) with a long after-discharge: the
prolongation of the neutral stimulus. The signals from the two stimuli
both reach a neurone at (3), are mixed there and added together at
(4). When the summated inputs reach a certain level, they discharge a
trigger neurone (5). This introduces a pulse into the quiescent closed
circuit at (6) which, by reason of positive feedback, continues to
oscillate for a long while. An output from this leads to a mixing
neurone at (7), which is also connected directly with the second
reflex arc. This neurone can only discharge when it is activated
simultaneously by signals from the storage circuit at (6) and a signal
from the second reflex arc. When it does receive signals from both,
its discharge is conducted to the output of the first reflex and has
the specific effect Es. It thus acts as a gate to Es, normally shut to
Sn but opened by the memory that Sn has often been followed by Ss.

Once this scheme had been worked out, it became possible to create an
electronic circuit to perform the necessary operations (see diagram at
the lower right-hand side of these two pages). The details are perhaps
of interest only to an electrical engineer; the system involves a
number of electronic tubes coupled with capacitors, resistors and so
on in such a way that the signals are properly amplified, timed and
mixed, and the resulting pulses are combined to produce the desired
results.

In one arrangement of the working model of M. docilis the specific
stimulus is a moderate light and the neutral one is the sound of a
whistle. The whistle is blown just before the light is seen; after
this has been repeated 10 or 20 times the model has “learned” that the
sound means light and will come to the whistle as though it were a
light. If it is teased by withholding of the light, it soon forgets
the lesson and disregards the sound. In another arrangement the
specific stimulus is touch, that is, an encounter with an obstacle. In
that case the whistle is blown just as the model comes into contact
with the obstacle, so that after a while the warning whistle triggers
a withdrawal and avoidance reaction. This process may of course be
accelerated by formal education: instead of waiting for the creature
to hit a natural obstacle the experimenter can blow the whistle and
kick the model. After a dozen kicks the model will know that a whistle
means trouble, and it can thus be guided away from danger by its
master. This last is an example of a negative or defensive conditioned
reflex; as in an animal, responses of this type are more easily
established and retained than any other. Because the mechanism sets up
very large oscillating pulses which keep feeding into the learning
circuit, the conditioned reflex, once established, lasts as long as
the decay time of the memory and requires little or no reinforcement.

LEARNING links two systems. Ss and Sn are specific and neutral stimuli;
Es and Es2, effects; T1 and T2, transmission systems; L, learning box.

CONDITIONED REFLEX requires this arrangement of nerve cells. Numbers
correspond to operations described in text and to diagram at right.


SEVERAL interesting problems arose in the working out of these
experiments. For example, the use of sound as a conditioned stimulus
was convenient, but the internal noise of the motors and gears was so
loud compared with an external sound that the model could not “hear”
the signal. It was found necessary to provide a special amplifier with
a resistance-capacitance feedback circuit sharply tuned to the note of
a whistle, about 3,000 cycles per second. As an alternative we tried
arranging a muting mechanism whereby the motors were turned off
periodically and the microphone was simultaneously switched on for a
moment to pick up any extraneous sound. This type of gating mechanism
emphasizes the importance of the stretching operation applied to the
sound signal, for the information the latter conveys is used after the
brief listening period, which may occur only once a second for a tenth
of a second. The muting-pulse device was not adopted because it seemed
more complicated than the sharply tuned amplifier, but the former may
be more akin to the physiological mechanisms in living creatures.

Further complications in M. docilis arise when the sound amplifier
(neutral stimulus) is arranged to produce its own specific effect. For
example, it can easily be arranged to make the sound switch off all
motors, so that the model “freezes” when it hears the whistle. Such a
reaction is very common in animals; many marsupials and rodents play
possum when they hear a strange noise. If now it is intended to teach
the model that sound means light, which may mean food, the freezing
reaction must be inhibited to permit conditioning of the new response.
A separate branch must therefore be taken from the output of the
mixing tube at (7) to the output of the sound amplifier, whereby the
“instinctive” effect of the latter is suppressed as soon as the
positive conditioning has been established.

WE have described so far the simplest possible mechanism, consisting
only of a single learning circuit connected to two signal amplifiers.
With this arrangement the model is reasonably docile. But if we
introduce a second learning circuit, or build in two neutral or
specific signals instead of one, it becomes only too easy to establish
an experimental neurosis. Thus if the arrangement is such that the
sound becomes positively associated both with the attracting light and
with the withdrawal from an obstacle, it is possible for both a light
and a sound to set up a paradoxical withdrawal. The “instinctive”
attraction to a light is abolished and the model can no longer
approach its source of nourishment. This state seems remarkably
similar to the neurotic behavior produced in human beings by exposure
to conflicting influences or inconsistent education. In the model such
ineffective and even destructive conditions can be terminated by rest,
by switching off or by disconnecting one of the circuits. These
treatments seem analogous to the therapeutic devices of the
psychiatrist: sleep, shock and psychosurgery.

In M. docilis the memory of association is formed by electric
oscillations in a feedback circuit. The decay of these oscillations is
analogous to forgetting; their evocation, to recall. If several
learning pathways are introduced, the creature’s oscillatory memory
becomes endowed with a very valuable feature: the frequency of each
oscillation, or memory, is its identity tag. A latent memory can be
detected and identified among others by a process of frequency
analysis, and a complex of memories can be represented as a synthesis
of oscillations which yields a characteristic wave pattern.
Furthermore a “memory” can be evoked by an internal signal at the
correct frequency, which resonates with the desired oscillation. The
implications of these effects are of considerable interest to those
who study the brain, for rhythmic electrical oscillation is the prime
feature of brain activity. We may gain new respect for the
speculations of the English physician-philosopher David Hartley, who
200 years ago suggested that ideas were represented in the brain as
vibrations and “vibratiuncles.”
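
The notion of a frequency serving as an identity tag can be
illustrated with a short Python sketch. It is an assumption-laden toy,
not a description of the model's circuitry: each memory is taken to be
a decaying sine wave, the frequencies of 3 and 7 cycles per second,
the sampling rate, the detection floor and all function names are
invented for the illustration, and "frequency analysis" is done by
correlating the combined wave with a sinusoid at each candidate
frequency (a single term of a discrete Fourier transform).

```python
import math


def memory_trace(freq_hz, amplitude=1.0, decay_per_s=0.5,
                 duration_s=2.0, rate_hz=200):
    """A damped oscillation standing in for one stored memory."""
    n = int(duration_s * rate_hz)
    return [amplitude
            * math.exp(-decay_per_s * i / rate_hz)
            * math.sin(2 * math.pi * freq_hz * i / rate_hz)
            for i in range(n)]


def strength_at(signal, freq_hz, rate_hz=200):
    """Magnitude of the signal's correlation with a sinusoid at freq_hz."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / rate_hz)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / rate_hz)
             for i, x in enumerate(signal))
    return math.hypot(re, im) / len(signal)


def identify_memories(signal, candidate_freqs, rate_hz=200, floor=0.1):
    """List the candidate frequencies present above the detection floor."""
    return [f for f in candidate_freqs
            if strength_at(signal, f, rate_hz) > floor]


# A "complex of memories": two traces superimposed into one wave pattern.
composite = [a + b for a, b in zip(memory_trace(3.0), memory_trace(7.0))]
print(identify_memories(composite, [3.0, 5.0, 7.0, 11.0]))
# prints: [3.0, 7.0]
```

The two stored oscillations are recovered from the combined wave,
while the candidate frequencies that were never stored are not
reported.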

THESE models are of course so simple that any more detailed comparison
between them and living creatures would be purely conjectural.
Experiments with larger numbers of circuits are perfectly feasible and
will certainly be instructive. One weakness of more elaborate systems
can be predicted with confidence: extreme plasticity cannot be gained
without some loss of stability. In the real world an animal must be
prepared to associate almost any event with almost any other; this
means that if a nervous system contains N specific receptor-effector
pathways, it should also include something of the order of N² − N
learning circuits. In such a system the chances of stability decline
rapidly as N increases. It is therefore no wonder that the incidence
of neuropsychiatric complaints marches with intellectual attainment
and social complexity.
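
To give a feeling for how quickly the estimate of N² − N grows, the
short Python sketch below tabulates it for a few values of N. The
values of N are chosen arbitrarily, and the reading of N² − N as
N(N − 1), the number of ordered pairs of distinct pathways, is an
interpretation offered here rather than something stated in the text.

```python
def learning_circuits(n_pathways):
    """Estimate from the text: N^2 - N, which equals N * (N - 1)."""
    return n_pathways ** 2 - n_pathways


for n in (2, 5, 10, 100):
    print(n, learning_circuits(n))
# prints:
# 2 2
# 5 20
# 10 90
# 100 9900
```

A tenfold increase in pathways, from 10 to 100, multiplies the number
of learning circuits by more than a hundred.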


W. Grey Walter is director of the physiological department at the
Burden Neurological Institute in Bristol, England.



[...]lined by simplified diagram. The circuit element labeled “3,000
cycles per second” is tuned so that Cora responds only [...]. [...]
“one cycle per second” provides machine with memory.